SUMMARY PAGE


PROBLEM 

To determine whether intense tones in the frequency region around 1000 Hz might affect
speech recognition in background noises similar to those found on ships.

METHOD 

The Speech Intelligibility Index (SII) was used to quantify the expected acoustic interference
of tones around 1000 Hz and pink noise. The tones simulate an active-sonar system that would
radiate back through the ship’s compartments. The pink noise simulates shipboard background
noise.

FINDINGS 

A speech-recognition model predicted that speech recognition could remain high in the
presence of the intense tonal maskers, but that speakers would have to raise their voices, often
to a shout, in order to maintain intelligibility. Results with actual speakers and listeners
verified this prediction.

APPLICATION

Intermittent, intense tones should not create undue problems for speech communication. The
tones in this study had a 6-sec duration with a 24-sec rest between tones, as might be used for an
active-duty cycle. If the tones were on for longer durations, however, speakers might have
difficulty maintaining the high voice levels required for communication. In addition, if listeners
are working on complex tasks or under stress, their speech understanding might decrease markedly.
Other factors that could markedly decrease speech understanding, especially in combination with
high levels of background noise and high levels of task complexity, include a listener with a
hearing loss, a speaker with unclear speech, or a poor transmission system distorting the speech
signal.


ADMINISTRATIVE  INFORMATION 

This  research  was  carried  out  under  a  task  plan  entitled,  "Development  of  acoustic 
habitability  standards  for  ships’  spaces  subjected  to  intense  tones"  and  was  funded  by  Program 
Executive  Office  Surface  Ship  ASW  Systems  Task  No.  SSAS-91-77A01R2  dated  14  December 
1990 "AN/SYQ-1 Frequency Array testing", Naval Sea Systems Command PMO 424. The
views expressed in this article are those of the authors and do not reflect the official policy or
position of the Department of the Navy, Department of Defense, or the U.S. Government. This
report was approved for publication on 30 July 1993, and designated as NSMRL Report 1188.
ABSTRACT


Predictions using the Speech Intelligibility Index (SII) are reported for speech recognition in
tonal  and  broadband  noise.  Three  levels  of  background  pink  noise  (60, 65,  and  70  dBA)  were 
used  for  the  SII.  Tones  in  the  frequency  range  around  1000  Hz  at  77, 83,  and  89  dB  SPL  were 
added  to  the  background  pink  noise.  The  speech  spectra  were  for  four  different  vocal  efforts 
(normal,  raised,  loud,  and  shouted).  The  simulated  listeners  for  most  of  the  predictions  were 
assumed  to  have  normal  hearing.  A  hearing  loss  was  also  included  for  a  subset  of  predictions.  If 
"barely  adequate"  speech  recognition  is  used  as  a  criterion,  the  effects  of  the  background  noise 
can  be  overcome  by  increasing  the  vocal  effort  of  the  speaker,  often  to  a  shout. 


EVALUATION OF COMMUNICATION DURING ACTIVE SONAR TRANSMISSIONS

WITH  A  SPEECH-RECOGNITION  MODEL 


The purpose of the present investigation was
to estimate the extent to which active sonar
sounds from a new sonar system would interfere
with speech recognition on ships. Speech-recognition
performance was modeled using
the Speech Intelligibility Index (ANSI, 1992).
The Speech Intelligibility Index (SII, which
was formerly known as the Articulation Index, or
AI) is the proportion of the total speech information
reaching the ear of the
listener, with each frequency band weighted
by its relative importance. The maximal value
of the SII (1.0) means that all acoustic information
is present; the minimal value (0.0)
means that no acoustic information is present.
Similarly, a value of 0.5 means that half of the
information is present. As a general guideline,
"excellent" speech understanding occurs for
SIIs above 0.75, "good" is between 0.6 and
0.75, "fair" is between 0.46 and 0.6, "poor" is
between 0.3 and 0.45, and below 0.3 is "bad"
(IEC, 1988).
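
The definition and guideline categories above can be sketched in a few lines of Python. This is only an illustration, not the model used in this report: the function names are hypothetical, the per-band audibility values (0.0 to 1.0) are assumed to be supplied by the caller, and the unlabelled interval between 0.45 and 0.46 is assigned to "poor" as an assumption.

```python
from typing import Sequence


def sii_from_bands(importance: Sequence[float], audibility: Sequence[float]) -> float:
    """Importance-weighted sum of per-band audibility (each clipped to 0.0-1.0)."""
    if len(importance) != len(audibility):
        raise ValueError("need one audibility value per band")
    return sum(w * min(max(a, 0.0), 1.0) for w, a in zip(importance, audibility))


def rating(sii: float) -> str:
    """Guideline category for an SII value, following the ranges quoted in the text."""
    # Assumption: values in the unlabelled 0.45-0.46 interval fall under "poor".
    if sii > 0.75:
        return "excellent"
    if sii >= 0.6:
        return "good"
    if sii >= 0.46:
        return "fair"
    if sii >= 0.3:
        return "poor"
    return "bad"


# Two equally important bands, one fully audible and one fully masked.
example_sii = sii_from_bands([0.5, 0.5], [1.0, 0.0])
example_rating = rating(example_sii)
```

With half of the weighted information available, the example yields an SII of 0.5, which falls in the "fair" range.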

In a noise background, the proportion of the
speech that is above the noise determines the
SII and, thus, speech understanding. A fundamental
effect of noise on speech production is
that vocal effort increases with increasing
background noise levels. As vocal effort
increases, there are two changes in the speech
spectra. The first is that the overall level increases.
The second is that the spectral shape
changes; there is more high-frequency emphasis
with increased vocal effort. Thus, it is important
to include the effect of vocal effort in
estimating speech understanding in background
noise.

Pearsons (personal communication, 1988) suggested
(based on work under contract to the
Environmental Protection Agency) that people
raise their voice to maintain 95% correct
sentence recognition, which is equivalent to
an SII of approximately 0.45. In general, the
lower cut-off for barely adequate communication
is considered to be 0.45 or 0.46. At this
SII, scores for single one-syllable words out of
context are considerably lower (around 70% correct)
than for sentences. Thus, communication
situations that depend on single words out of
context require higher SIIs.

In the present paper, we provide SIIs at three
background levels of a tonal complex used to
simulate the active sonar of interest. Four vocal
efforts are presented ranging from normal
to shouted speech. The effects of hearing loss
also are briefly discussed.
Method

The  modeled  speech  signal  was  the  average 
speech  spectrum  of  male  and  female  speech  as 
measured  in  free  field,  one  meter  from  the 
talkers’ lips for four vocal efforts - normal,
raised,  loud,  and  shout  (Pavlovic,  Rossi,  and 
Espesser,  1990).  For  normal,  raised,  loud, 
and  shouted  vocal  efforts,  the  overall  levels  of 
speech  are  62.4  dB  SPL,  68.3  dB  SPL,  74.8 
dB  SPL,  and  82.3  dB  SPL  respectively. 

The speech spectrum undergoes two additional
changes in the hearing process. The
first is that the head and external ear (pinna
and ear canal) alter the spectrum via their resonance
characteristics (i.e., transfer function).
The second is that, in the inner ear, energy
from some frequency bands may spread to others,
especially at higher intensity levels (i.e.,
masking). The same factors affect the noise
spectrum. The present model uses a free-field-to-eardrum
transfer function (Bentler and
Pavlovic, 1989), and the spread of masking
across bands was calculated according to
Ludvigsen (1985).

At  higher  levels  of  speech  input,  increases  in 
the  speech  level  do  not  increase  the  speech 
recognition  at  the  same  rate  as  at  lower  levels. 
In  the  current  model,  a  decreasing  proportion 
of  the  speech  energy  contributed  to  the  SII  at 
higher  levels  (above  approximately  72  dB 
SPL). 


The speech and noise spectra were divided
into 18 1/3-octave bands (with center frequencies
from 160 to 8000 Hz) for this analysis.
The weighting for the relative importance of
each frequency band in the speech signal is
dependent on the speech materials used.¹ The
importance function for this analysis was the
frequency-band weightings averaged across
several types of speech materials (e.g., nonsense
syllables, monosyllabic meaningful
words, easy running speech) (Pavlovic, 1987).

The background noise used in our SII calculations
had two components. The first was pink
noise,² which approximates the measured
backgrounds in many compartments of surface
ships. Three levels of pink noise, 60, 65,
and 70 dBA, were used. The levels 60 dBA
and 65 dBA are maximum permissible levels
of continuous broadband noise on Naval vessels
for category A-12 (talker-listener
distance 6 feet or greater) and C (quiet
essential; e.g., sonar and medical) spaces,
respectively (CNO, OPNAVINST 9640.1,
1979). The 70 dBA level is specified both for
A-3 spaces, where talker-listener distances are
less than three feet (e.g., small offices), and
for areas where the primary consideration is
comfort (e.g., berthing areas and wardrooms).
The second background noise component was
pure tones at the frequencies of 720, 800, 880,
960, 1040, and 1120 Hz, which simulated active
sonar pings. The model assumed that one
of these tones was always present, with each
being equally represented.
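
To see which of the 18 analysis bands the simulated sonar tones occupy, the band edges can be computed directly. The sketch below is illustrative only: it assumes the usual base-ten definition of 1/3-octave bands (exact center frequencies of 1000 x 10^(k/10) Hz, with edges a factor of 10^(1/20) either side), which this report does not state explicitly, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Nominal center frequencies of the 18 bands used in the analysis (160-8000 Hz).
NOMINAL_CENTERS_HZ = [160, 200, 250, 315, 400, 500, 630, 800, 1000,
                      1250, 1600, 2000, 2500, 3150, 4000, 5000, 6300, 8000]
SONAR_TONES_HZ = [720, 800, 880, 960, 1040, 1120]


def band_edges(k: int) -> Tuple[float, float]:
    """Lower and upper edge of the 1/3-octave band with index k (k = 0 is 1000 Hz)."""
    center = 1000.0 * 10 ** (k / 10)   # assumed base-ten exact center frequency
    half_band = 10 ** (1 / 20)         # half of one 1/3-octave step
    return center / half_band, center * half_band


def band_of(freq_hz: float) -> int:
    """Nominal center frequency of the band that contains freq_hz."""
    for nominal, k in zip(NOMINAL_CENTERS_HZ, range(-8, 10)):
        low, high = band_edges(k)
        if low <= freq_hz < high:
            return nominal
    raise ValueError(f"{freq_hz} Hz is outside the 18 analysis bands")


tones_by_band: Dict[int, List[int]] = {}
for tone in SONAR_TONES_HZ:
    tones_by_band.setdefault(band_of(tone), []).append(tone)
```

Under these assumptions the three lower tones fall in the 800-Hz band and the three upper tones in the 1000-Hz band, so the tonal masker is concentrated in just two of the 18 bands.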


1 Speech understanding is influenced by the redundancy of the speech; the greater the speech redundancy, the
easier it is to understand and the less it will be degraded in difficult listening situations. The frequency
distribution of the usable informational content of the signal also is affected by the speech redundancy. For
example, the frequency band with the highest weight in the importance function for a typical set of nonsense
syllables is 2500 Hz while for a typical sample of running speech it is 450 Hz. See Pavlovic (1987) for more
details.

2 Pink noise has a continuous frequency spectrum with spectrum level decreasing at 3 dB/octave, which has the
result of having equal energy within a bandwidth proportional to the center frequency of the band.
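
The property described in footnote 2 can be checked numerically. The following sketch assumes an ideal pink spectrum whose power density is proportional to 1/f; the scale constant and function names are arbitrary and are introduced only for illustration.

```python
import math


def spectrum_level_db(freq_hz: float, scale: float = 1.0) -> float:
    """Spectrum level (dB, arbitrary reference) of an ideal pink spectrum, density = scale / f."""
    return 10 * math.log10(scale / freq_hz)


def pink_band_power(center_hz: float, scale: float = 1.0) -> float:
    """Power in the 1/3-octave band around center_hz: the integral of scale / f over the band."""
    low = center_hz * 2 ** (-1 / 6)
    high = center_hz * 2 ** (1 / 6)
    return scale * math.log(high / low)


# Doubling the frequency lowers the spectrum level by 10*log10(2), about 3 dB.
octave_drop_db = spectrum_level_db(1000.0) - spectrum_level_db(2000.0)

# Bands whose width is proportional to their center frequency carry equal power.
band_powers = [pink_band_power(fc) for fc in (160.0, 1000.0, 8000.0)]
```

The band power depends only on the ratio of the band edges, which is the same for every 1/3-octave band, so the three values in `band_powers` agree.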
The hypothetical listeners for our calculations
had  normal  hearing  (0  dB  HL  thresholds  at  all 
frequencies)  and  were  listening  binaurally. 

Speech-recognition data also were collected
on Navy enlisted personnel who were in the
laboratory as subjects for habitability studies
on the effects of the active-sonar pings. All
had normal hearing thresholds (less than or
equal to 15 dB HL from 125 through 8,000
Hz). Five or six subjects lived in the laboratory
at the same time. For the speech tests,
each subject was the speaker for one 50-word
list of the Modified Rhyme Test (Kreul et al.,
1968). The monosyllabic words take the form
of consonant-vowel-consonant (e.g., "bad").
The test has a six-alternative, closed-set test
format in which, for each test item, the listener
has six words to choose from (the test
word and five foils, which differ from the test
word by either the initial or final consonant).

Each subject practiced the assigned list by
reading it aloud to the experimenter. Any
mispronunciations were corrected during the
practice time. During the test, the talker used the
phrase, "Mark ____, ____, and ____ please,"
using connected speech with three words from
the list to approximate normal speaking conditions.
For each list, one subject was the talker,
and the other subjects were the listeners. The
talkers were given no specific instructions
about vocal effort, but rather were asked to
speak normally. Most, however, tended to
speak more slowly and loudly with more precise
pronunciation when reading the word
lists to the group in the test situation than they
did in informal conversations.

Details of the physical and acoustical characteristics
of the test room are given in
Sylvester (1993). Both talkers and listeners
were seated on their beds during the testing.
The subjects were given no instructions about
whether to watch the speaker. Although
watching the speaker would have been a good
strategy because visual cues aid speech understanding,
the subjects looked at the answer
sheet instead of the talker.

Testing occurred over three sequential days.
On each of the three days, the subject read the
same list, but the word order was varied on
each day. On the first day, the test was administered
in the room with no additional noise
added (roughly in the 46-51 dBA average
range). On the second day, 60 dBA pink noise
was added to the room. On the third day, both
the 60 dBA pink noise and the active-sonar
pings at the particular level
assigned to that subject group were present.
For two groups, two additional days of testing
took place to assess learning effects. Day 4
was the pink-noise condition (like Day 2), and
Day 5 was a repeat of Day 1, with no additional
noise added to the room.

Results 

A. SII and pings

SIIs, assuming normal-hearing listeners, for
60 dBA background noise are shown in Figure
1 (left panel) (the right panel shows a simulated
hearing loss, which will be discussed in
the next section). SIIs for 65 and 70 dBA
background noise are shown in Figure 2.
(All SII values also are included in tabular
form in the Appendix.) Each panel of the figures
shows SIIs for the pink noise alone and
with three levels of tones - 77 dB SPL, 83 dB
SPL, and 89 dB SPL. The speech spectra are
for normal, raised, loud, and shouted speech.

In order to maintain barely adequate speech
communication (i.e., SII = 0.46 or greater)
in a 60 dBA noise background (Figure 1,
left panel), speakers must use a raised voice.
With 77, 83, or 89 dB SPL tones added to the
background noise, they must speak loudly.
The speech levels predicted from the SII
model are consistent with observed speech
levels of talkers during a speech communication
task.
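
The vocal-effort requirement for the 60 dBA background can be read off mechanically from the SII values tabulated in the Appendix (Table A-1). The sketch below simply picks the lowest vocal effort whose SII meets the 0.46 criterion; the helper name and data layout are illustrative choices, not part of the original analysis.

```python
from typing import Dict, List, Optional

CRITERION = 0.46  # "barely adequate" communication
EFFORTS = ["normal", "raised", "loud", "shout"]

# SIIs from Table A-1 (60 dBA pink noise, normal hearing), ordered as in EFFORTS.
TABLE_A1: Dict[str, List[float]] = {
    "pink 60":           [0.387, 0.630, 0.829, 0.883],
    "pink 60 + tone 77": [0.255, 0.450, 0.657, 0.802],
    "pink 60 + tone 83": [0.222, 0.393, 0.573, 0.740],
    "pink 60 + tone 89": [0.189, 0.324, 0.479, 0.637],
}


def minimum_effort(siis: List[float], criterion: float = CRITERION) -> Optional[str]:
    """Lowest vocal effort whose SII reaches the criterion, or None if none does."""
    for effort, sii in zip(EFFORTS, siis):
        if sii >= criterion:
            return effort
    return None


required = {masker: minimum_effort(siis) for masker, siis in TABLE_A1.items()}
```

For the pink noise alone the result is "raised"; for each of the three tone levels it is "loud", in agreement with the statement above. Note that the raised-voice SII with the 77 dB SPL tone (0.450) falls just short of the 0.46 criterion.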


In order to maintain barely adequate speech
communication in the 65 dBA broadband
noise background (Figure 2, left panel), the
speaker must use a raised voice, and, when
the tones are added at 77 and 83 dB SPL,
must speak loudly. With the addition of an 89
dB SPL tone, the speaker must shout. In 70
dBA background noise, the speaker must
speak loudly, and the addition of the tone (at
all three levels) requires the speaker to shout.


The effects of a hearing loss also were modeled.
Thresholds of 20, 30, 35, 40, 55, 90, and
90 dB HL at 0.5, 1, 2, 3, 4, 6, and 8 kHz,
respectively, were modeled as a simple
attenuation (change in threshold levels), as
per the draft ANSI standard (ANSI, 1992).


We know of no recently published studies of
the hearing levels of officers, but high-frequency
hearing-threshold elevations of
this magnitude are found among the enlisted
active-duty population (NEHC, 1990).


Figure 1 (right panel) shows the effect of the
simulated hearing loss. As expected, the SIIs
are lower for the listener with hearing loss.
Also, there is less improvement with increased
vocal effort if the listener has a hearing
loss. The high-frequency acoustic information
that is available to the normal-hearing
listener at increased vocal efforts is not available
to a listener with a high-frequency hearing
loss. For our simulated listener with
hearing loss, the loss at 4000-5000 Hz was the
predominant cause of the decreased SIIs.
(Band-by-band speech-to-noise ratios
illustrating this effect are given in the Appendix
in Table A-5.) Noise-induced hearing
loss, which is the most frequent cause of
hearing impairment in the Navy, typically is
maximal at 4000 Hz. As can be seen in Figure
1, this hearing-impaired listener requires
greater vocal effort from the speaker to achieve
an SII comparable to a normal-hearing listener.
Note that in the 60 dBA background
noise used in our simulations, a normal-hearing
listener needs a raised voice level, but the
hearing-impaired listener needs a loud voice
level for barely adequate communication.


[Figure 1: Effects of tone in noise on SII at four vocal efforts. Left panel: background
noise 60 dBA. Right panel: background noise 60 dBA with hearing loss.]

Figure 1. SIIs for normal-hearing listeners (left panel) and a hearing loss acceptable for U.S. Naval officers
(right panel). The background noise in both cases is 60 dBA pink noise with a tonal masker (1000-Hz
region) added at 77, 83, and 89 dB SPL. An SII of 0.46 is considered barely adequate communication.


[Figure 2: Effects of tone in noise on SII at four vocal efforts.]

Figure 2. SIIs for normal-hearing listeners in 65 (left panel) and 70 (right panel) dBA pink noise. As in
Figure 1, a tonal masker (1000-Hz region) is added at 77, 83, and 89 dB SPL.

We did not incorporate the increase in
upward spread of masking that accompanies
increasing hearing loss (as modeled by
Ludvigsen, 1985, and Humes, Espinoza-Varas,
and Watson, 1988) because there is much
individual variability in the size of this effect,
and there is no consensus on appropriate modeling
of hearing loss. Therefore, our calculations
are an upper limit on the SII, and the
performance of groups of hearing-impaired
listeners actually would be expected to be
somewhat poorer than we have shown. In
addition, there is variability among the individual
listeners’ performance, especially for
hearing-impaired listeners. That is, it is well
known that some hearing-impaired individuals
have much poorer speech recognition than
others with similar amounts of hearing loss.³
These individual differences may well be accounted
for by differences in spread of masking.
For example, Dubno, He, Schaefer, and
Ahlstrom (1992) found that taking into
account individual amounts of spread of masking
for SII calculations greatly improved the
ability to accurately predict the relationship
between SII values and speech-recognition
scores. Accurately predicting both mean data
and the range of performance for hearing-impaired
listeners remains to be done.


3 The audiogram describes only one component of a hearing loss. It is not surprising that individuals with
similar audiograms differ in other aspects of hearing such as frequency discrimination and uncomfortable
loudness levels.
Table 1

Word-recognition scores for seventeen subjects in three background noises. The
numbers in parentheses below the scores are standard errors of the mean.


Ambient room noise    Pink noise (60 dBA)    Pink noise (60 dBA) + 89 dB pings

      90.1%                 82.7%                       78.6%
      (1.0)              (illegible)                    (1.7)
C. Actual speech-recognition scores with 89 dB pings

Word-recognition scores are given in Table 1
for the seventeen subjects who listened in the
presence of 89 dB pings. Individual scores
were computed for each list, and all the scores
from each individual for one day (one condition)
were averaged. Then the scores for all
the subjects on each day were averaged.
Because these were essentially unpracticed
talkers/listeners, practice effects were seen, as
determined by comparisons of the scores on
Days 2 and 4 (pink noise) and Days 1 and 5
("quiet;" i.e., ambient room noise) for the two
groups in which scores were measured for
five, rather than just three, days. The use of a
closed-set test eliminated the listener effects
due to learning the speech materials. Another
factor that affects listeners, however, is learning
to listen in particular acoustic conditions
or becoming accustomed to a talker’s speech
pattern. Many of the talkers spoke more
clearly and at higher levels as they became
more practiced. In order to correct for learning
effects, we decided to reference the scores
to the third day (ping day). A correction factor
for learning was determined, using the data
for the two groups (twelve subjects) that were
tested across five days, by averaging the
scores for the two days that had the same
condition (Days 2 and 4, and Days 1 and 5)
and then subtracting the score on the initial
test for that condition (Day 1 for "quiet" and
Day 2 for "pink noise") from the average.
This correction factor was added
to the average score for the seventeen subjects.
The correction factor was 2% for ambient
noise only (Day 1) and 1.8% for the pink
noise (Day 2).
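
One reading of the learning correction described above can be written out as a short calculation. The sketch assumes that the correction is the average of the first and repeat scores for a condition minus the first score, which places the estimate at the midpoint of testing (Day 3); the day-by-day scores below are hypothetical, since the individual-day means are not reported here.

```python
def learning_correction(first_score: float, repeat_score: float) -> float:
    """Average of the two same-condition scores minus the initial score (percentage points)."""
    midpoint = (first_score + repeat_score) / 2
    return midpoint - first_score


def corrected_score(group_mean: float, correction: float) -> float:
    """Group mean for the initial test day, referenced to the ping day (Day 3)."""
    return group_mean + correction


# Hypothetical five-day group means for the "quiet" condition (Days 1 and 5).
quiet_day1, quiet_day5 = 88.0, 92.0
quiet_correction = learning_correction(quiet_day1, quiet_day5)

# Hypothetical uncorrected Day 1 mean for all subjects, adjusted to Day 3.
quiet_referenced_to_day3 = corrected_score(88.5, quiet_correction)
```

With these made-up values the correction is 2.0 percentage points, the same size as the correction reported for the ambient-noise condition.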

As the signal-to-noise ratio decreased, the
word-recognition scores decreased, as expected.
They would have decreased more
except that the speakers increased their vocal
effort as the background noise increased. In
no condition were the word-recognition
scores even close to perfect. The primary reason
probably is that the listeners were
always listening in a noise background at levels
sufficient to mask portions of the speech.

Our impression is that factors such as regional
accents and inattention influenced few individual
scores and thus had a minimal effect on
the mean scores.

The relative decrements across conditions are
consistent with the trend predicted by the SII
analysis. The exact decrement in percent correct
cannot be predicted from the SII values
because the transfer function between SII and
percent correct for our particular set of speakers
and speech materials is unknown.
Conclusions

Our conclusion is that, given the low duty cycle
(20%) of the active sonar (i.e., on for 6
seconds, off for 24 seconds), speakers probably
can compensate for the interference of the
tones, even at 89 dB, by increasing the vocal
effort, often to a shout, while the active sonar
is activated. If the sonar had a higher duty cycle,
our conclusion would become more conservative
because people would be required to
maintain greater vocal effort for longer periods
of time. Not only might they be unwilling
to maintain a high level of vocal effort over
long periods of time, but they also would become
hoarse, with a resultant inability to maintain
the required vocal intensity.
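
The 20% figure follows directly from the ping timing. As a small illustrative calculation (the function name is hypothetical, and the minutes-per-hour figure is simply the duty cycle applied to one hour, not a value taken from the study):

```python
def duty_cycle(on_seconds: float, off_seconds: float) -> float:
    """Fraction of time the sonar tone is present."""
    return on_seconds / (on_seconds + off_seconds)


sonar_duty_cycle = duty_cycle(6.0, 24.0)          # 6 s on, 24 s off
minutes_per_hour_with_tone = 60.0 * sonar_duty_cycle
```

Talkers who communicate continuously would therefore need the higher vocal effort for roughly 12 minutes of every hour; doubling the on-time to 12 seconds with the same 24-second rest would raise the duty cycle to one third.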

For particular applications, there are other factors
that need to be considered. All of these
would result in a more conservative conclusion.

(1)  We  have  assumed  that  the  talkers 
speak  clearly  and  distinctly.  Presumably, 
even  if  a  speaker’s  actual  speaking  style  is 
less  clear,  training  can  improve  speech 
clarity  in  noisy  backgrounds. 

(2) Our recommendations assume that all
listeners have normal hearing. However,
hearing losses that are acceptable according
to Navy standards will lower the SIIs
if the hearing loss is sufficient to filter out
speech acoustical information. In addition,
many listeners with hearing losses also
have more upward spread of masking than
predicted by this SII model. The effective
result is greater perceived noise for the
hearing-impaired listeners and thus lower
SIIs. The actual hearing levels of personnel
using particular spaces should be taken
into account in determining whether communication
in a particular space will be
adequate.


(3) Speech transmitted through communication
systems (which often are band-limited
and distorted) often results in lower
speech recognition. If speech understanding
in an area is already only "barely
adequate" due to background noise levels,
distortions in the speech signal over communication
systems can easily decrease
speech understanding to unacceptable levels.
In addition, the talker may be communicating
from a relatively quiet
environment to the listener’s noisy environment
and may not adequately compensate
for the low signal-to-noise ratio in the
listener’s space.

(4) These SIIs assume that the listener is
able to give full attention, both mentally
and visually, to the talker. If performing a
complex task or several tasks simultaneously
(i.e., high cognitive load or divided
attention tasks), or if performing under
stress, speech understanding could be lowered.
Also, in stressful situations, the talkers’
speech may deteriorate; they may
speak more rapidly and less clearly. If
these conditions are likely to occur in
particular environments, a higher SII
may be required for reasonably good
communication.

(5) For these reasons, a more stringent
requirement for speech should be considered
if understanding must be quick and
accurate during the six-second active-sonar
interval. The importance of good
speech recognition has been recognized
by the Chief of Naval Operations, who
specified that "direct speech communication
must be understood with minimal error
and without need for repetition" in
Category A spaces (OPNAVINST 9640.1,
1979). Unfortunately, this description is
incongruent with the permissible background
noise levels in this same document.
The noise levels are too high to
permit such good speech recognition.
APPENDIX


Note: Tables A-1 through A-4 are SIIs at several
background noise levels. These SIIs are plotted
in the text as Figures 1 and 2.


Table A-1

SIIs for four speech levels (from normal to shout) with a pink-noise background of 60
dBA and tones (sequentially presented tones in random order at 720, 800, 880, 960,
1040, and 1120 Hz) at 77, 83, and 89 dB SPL. The listener has normal hearing.


                                        MASKER

SPEECH LEVEL    PINK 60    PINK 60 + TONE 77    PINK 60 + TONE 83    PINK 60 + TONE 89

NORMAL          0.387      0.255                0.222                0.189
RAISED          0.630      0.450                0.393                0.324
LOUD            0.829      0.657                0.573                0.479
SHOUT           0.883      0.802                0.740                0.637

Table A-2

SIIs as in Table A-1 except with a pink-noise background of 65 dBA.


                                        MASKER

SPEECH LEVEL    PINK 65    PINK 65 + TONE 77    PINK 65 + TONE 83    PINK 65 + TONE 89

NORMAL          0.291      0.188                0.170                0.156
RAISED          0.485      0.379                0.356                0.321
LOUD            0.735      0.632                0.584                0.527
SHOUT           0.922      0.860                0.817                0.754

Table  A-3 

SIIs as in Table A-1 except with a pink-noise background of 70 dBA.


MASKER


SPEECH LEVEL

PINK 70

PINK 70 + TONE 77

PINK 70 + TONE 83

PINK 70 + TONE 89

NORMAL 

0.081 

0.080 

RAISED 

0.217 

0.189 

0.157 

LOUD 

0.429 

0.314 

SHOUT 

0.652 

0.596 

0.514 



Table  A-4 

SIIs as in Table A-1 except with a hearing loss allowable for an officer in the U.S. Navy.


MASKER 


SPEECH  LEVEL 

PINK  60 

PINK  60+  TONE  77  PINK  60+  TONE  83  PINK  60+  TONE  89 

NORMAL 

ISi 

0.206 

0.176 

RAISED 

0.338 

0.272 

LOUD 

0.567 

0.483 

0392 

SHOUT 

0.685 

0.624 

0324 

Table  A-5 

Speech-to-noise ratios (SNRs) modeled (SII) with a normal vocal effort, pink noise at 60
dBA, and tones at 89 dB SPL. The "noise" was either the pink noise or the hearing
threshold, whichever was greater. The hearing loss was thresholds of 20, 30, 35, 40, 55,
90, and 90 dB SPL at 500, 1,000, 2,000, 3,000, 4,000, 6,000, and 8,000 Hz, respectively.
SNRs lower than -14 dB do not contribute to the SII.


1/3-Octave     1/3-Octave        SNR for       SNR for         Band
Band           Band Center       0 dB HL       hearing loss    Importance
Number         Frequency (Hz)    thresholds

 1               160               -0.2          -0.2          0.0083
 2               200                2.8           2.8          0.0095
 3               250                4.1           4.1          0.0150
 4               315                4.3           4.3          0.0289
 5               400                5.9           5.9          0.0440
 6               500                6.6           6.6          0.0578
 7               630                5.4           5.4          0.0653
 8               800              -38.1         -38.1          0.0711
 9              1000              -40.4         -40.4          0.0818
10              1250               -0.7          -0.7          0.0844
11              1600               -2.5          -2.5          0.0882
12              2000               -4.3          -4.3          0.0898
13              2500               -6.9          -6.9          0.0868
14              3150               -8.1          -8.1          0.0844
15              4000               -9.3         -24.2          0.0771
16              5000              -12.3         -46.3          0.0527
17              6300              -14.1         -64.1          0.0364
18              8000              -14.3         -64.3          0.0185
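
The band data in Table A-5 can be used to see how much of the importance function is removed entirely by the maskers and by the hearing loss. The sketch below sums the importance of bands whose SNR is below the -14 dB floor stated in the caption. It is an illustration rather than a reimplementation of the SII model: a few entries whose decimal points are uncertain in the source are read here as 4.3, -2.5, -4.3, -46.3, -14.3, and -64.3 (an assumption), and the function name is hypothetical.

```python
from typing import List, Tuple

FLOOR_DB = -14.0  # SNRs below this do not contribute to the SII (Table A-5 caption)

# (center Hz, SNR normal hearing, SNR with hearing loss, band importance)
BANDS: List[Tuple[int, float, float, float]] = [
    (160, -0.2, -0.2, 0.0083), (200, 2.8, 2.8, 0.0095), (250, 4.1, 4.1, 0.0150),
    (315, 4.3, 4.3, 0.0289), (400, 5.9, 5.9, 0.0440), (500, 6.6, 6.6, 0.0578),
    (630, 5.4, 5.4, 0.0653), (800, -38.1, -38.1, 0.0711), (1000, -40.4, -40.4, 0.0818),
    (1250, -0.7, -0.7, 0.0844), (1600, -2.5, -2.5, 0.0882), (2000, -4.3, -4.3, 0.0898),
    (2500, -6.9, -6.9, 0.0868), (3150, -8.1, -8.1, 0.0844), (4000, -9.3, -24.2, 0.0771),
    (5000, -12.3, -46.3, 0.0527), (6300, -14.1, -64.1, 0.0364), (8000, -14.3, -64.3, 0.0185),
]


def non_contributing_importance(snr_column: int, floor_db: float = FLOOR_DB) -> float:
    """Total importance of bands whose SNR is below the floor (column 1 or 2 of BANDS)."""
    return sum(band[3] for band in BANDS if band[snr_column] < floor_db)


total_importance = sum(band[3] for band in BANDS)
lost_normal = non_contributing_importance(1)
lost_hearing_loss = non_contributing_importance(2)
extra_loss = lost_hearing_loss - lost_normal
```

Under these assumptions the importance weights sum to 1.0, about 0.21 of the importance is unavailable to the normal-hearing listener (the two tone-masked bands plus the two highest bands), and about 0.34 is unavailable with the hearing loss. The difference of about 0.13 comes entirely from the 4000- and 5000-Hz bands, consistent with the statement in the text that the loss at 4000-5000 Hz was the predominant cause of the decreased SIIs.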
